Towards Heterogeneous Resources-Based Ambiguity Reduction of Sub-typed Geographic Named Entities
Authors
Abstract
The aim of this work is to find sub-typed Geographic Named Entities by analysing the relations between Place Names and the surrounding nominal group within a specific phrasal context in a set of textual documents. The paper presents a method involving natural language processing and heterogeneous resources such as gazetteers, thesauri or ontologies. The work and the results focus on a French-language corpus. However, the fairly generic lexico-syntactic patterns used in pre-selected phrasal contexts can be tuned for other languages.
Keywords: natural language processing, Named Entity categorization, verbal relations, lexico-syntactic patterns, finite-state transducers
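To make the idea of sub-typing a Place Name from its surrounding nominal group more concrete, here is a minimal sketch in Python. It is not the authors' implementation: a regular expression stands in for their finite-state transducers, and the feature-noun table, the toy gazetteer entries and the function name `subtype_place_names` are all hypothetical, chosen only for illustration. The pattern assumes a place name starting with an unaccented capital letter.

```python
import re

# Hypothetical mapping from French feature-type nouns to sub-types
# (an assumption for illustration, not taken from the paper).
FEATURE_NOUNS = {"ville": "city", "vallée": "valley", "rivière": "river"}

# Toy gazetteer: place name -> candidate sub-types (illustrative entries only).
GAZETTEER = {"Pau": {"city", "river"}, "Ossau": {"valley", "river"}}

# Lexico-syntactic pattern: <feature noun> de|du|d' <Capitalised place name>
PATTERN = re.compile(
    r"\b(?P<noun>" + "|".join(FEATURE_NOUNS) + r")\s+"
    r"(?:de\s+|du\s+|d['’])"
    r"(?P<name>[A-Z][\w-]*)"
)


def subtype_place_names(text):
    """Return (place name, sub-type) pairs for which the nominal context
    selects one of the candidate sub-types listed in the gazetteer."""
    results = []
    for match in PATTERN.finditer(text):
        name = match.group("name")
        subtype = FEATURE_NOUNS[match.group("noun")]
        # Keep the reading only if the gazetteer lists it as a candidate.
        if subtype in GAZETTEER.get(name, set()):
            results.append((name, subtype))
    return results


if __name__ == "__main__":
    print(subtype_place_names("la vallée d'Ossau et la ville de Pau"))
```

In this sketch the gazetteer supplies the ambiguous candidates and the nominal group ("ville de", "vallée d'") picks one of them. A name absent from the gazetteer, or a context that contradicts it, yields no result rather than a guess.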
Similar resources
Person Name Recognition Using Name-Candidate Word Injection in Conditional Random Fields for the Arabic Language
Named Entity Recognition and Extraction are very important tasks for discovering proper names, including persons, locations, dates, and times, inside electronic textual resources. An accurate named entity recognition system is an essential utility for resolving fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...
Preservation of Social Web Content based on Entity Extraction and Consolidation
With the rapidly increasing pace at which Web content is evolving, particularly social media, preserving the Web and its evolution over time becomes an important challenge. Meaningful analysis of Web content lends itself to an entity-centric view to organise Web resources according to the information objects related to them. Therefore, the crucial challenge is to extract, detect and correlate e...
Entity Extraction and Consolidation for Social Web Content Preservation
With the rapidly increasing pace at which Web content is evolving, particularly social media, preserving the Web and its evolution over time becomes an important challenge. Meaningful analysis of Web content lends itself to an entity-centric view to organise Web resources according to the information objects related to them. Therefore, the crucial challenge is to extract, detect and correlate e...
Biological Nomenclatures: A Source of Lexical Knowledge and Ambiguity
There has been increased work in developing automated systems that involve natural language processing (NLP) to recognize and extract genomic information from the literature. Recognition and identification of biological entities is a critical step in this process. NLP systems generally rely on nomenclatures and ontological specifications as resources for determining the names of the entities, a...
Identification of Composite Named Entities in a Spanish Textual Database
Named entities (NE) mentioned in textual databases constitute an important part of their semantics. Lists of those NE are an important knowledge source for diverse tasks. We present a method for NE identification focused on composite proper names (names with coordinated constituents and names with several prepositional phrases). We describe a method based on heterogeneous knowledge and simple r...